Fast Parallel Randomized Algorithm for Nonnegative Matrix Factorization with KL Divergence for Large Sparse Datasets
Authors
Abstract
Nonnegative Matrix Factorization (NMF) with Kullback-Leibler divergence (NMF-KL) is one of the most significant NMF problems and is equivalent to Probabilistic Latent Semantic Indexing (PLSI), which has been successfully applied in many applications. For sparse count data, a Poisson distribution and KL divergence yield sparse models and sparse representations, which describe the random variation better than a normal distribution and the Frobenius norm. In particular, sparse models give a more concise picture of how attributes appear across latent components, while sparse representations give a concise interpretation of how latent components contribute to individual instances. However, minimizing NMF with KL divergence is much harder than minimizing NMF with the Frobenius norm, and sparse models, sparse representations, and fast algorithms for large sparse datasets remain open challenges for NMF with KL divergence. In this paper, we propose a fast parallel randomized coordinate descent algorithm with fast convergence on large sparse datasets that achieves sparse models and sparse representations. In our experiments, the proposed algorithm outperforms existing methods for this problem.
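For context, the standard baseline for the problem the abstract describes is the classical multiplicative-update rule for NMF under generalized KL divergence (Lee and Seung). The sketch below is a minimal dense illustration of that baseline only; it is not the paper's parallel randomized coordinate descent algorithm, and the function name and parameters are chosen here for illustration:

```python
import numpy as np

def nmf_kl(V, rank, n_iter=300, eps=1e-10, seed=0):
    """Minimal NMF minimizing generalized KL divergence D(V || WH)
    via multiplicative updates. A dense baseline sketch, not the
    paper's randomized coordinate descent method."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        # H update: H <- H * (W^T (V / WH)) / (W^T 1)
        H *= (W.T @ (V / WH)) / W.sum(axis=0)[:, None]
        WH = W @ H + eps
        # W update: W <- W * ((V / WH) H^T) / (1 H^T)
        W *= ((V / WH) @ H.T) / H.sum(axis=1)[None, :]
    return W, H
```

Each update is elementwise and keeps the factors nonnegative by construction; for large sparse count matrices, methods such as the one in this paper avoid forming the dense product `W @ H` and only touch nonzero entries of `V`.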
Similar resources
Projective Nonnegative Matrix Factorization with α-Divergence
A new matrix factorization algorithm which combines two recently proposed nonnegative learning techniques is presented. Our new algorithm, α-PNMF, inherits the advantages of Projective Nonnegative Matrix Factorization (PNMF) for learning a highly orthogonal factor matrix. When the Kullback-Leibler (KL) divergence is generalized to α-divergence, it gives our method more flexibility in approximati...
Kullback-Leibler Divergence for Nonnegative Matrix Factorization
The I-divergence, or unnormalized generalization of Kullback-Leibler (KL) divergence, is commonly used in Nonnegative Matrix Factorization (NMF). This divergence has the drawback that its gradients with respect to the factorizing matrices depend heavily on the scales of the matrices, and learning the scales in gradient-descent optimization may require many iterations. This is often handled by expl...
Online Blind Separation of Dependent Sources Using Nonnegative Matrix Factorization Based on KL Divergence
This paper proposes a novel online algorithm for nonnegative matrix factorization (NMF) based on the generalized Kullback-Leibler (KL) divergence criterion, aimed at overcoming the high computational cost that conventional batch NMF algorithms incur on large-scale data. It features stable alternating updates of the factors for each newly arriving observation, and provides an efficient solution f...
Online Projective Nonnegative Matrix Factorization for Large Datasets
Projective Nonnegative Matrix Factorization (PNMF) is one of the recent methods for computing low-rank approximations to data matrices. It is advantageous in many practical application domains such as clustering, graph partitioning, or sparse feature extraction. However, up to now a scalable implementation of PNMF for large-scale machine learning problems has been lacking. Here we provide an on...
Accelerated parallel and distributed algorithm using limited internal memory for nonnegative matrix factorization
Nonnegative matrix factorization (NMF) is a powerful technique for dimension reduction, extracting latent factors, and learning part-based representations. For large datasets, NMF performance depends on several major issues, such as fast algorithms, fully parallel distributed feasibility, and limited internal memory. This research designs a fast fully parallel and distributed algorithm using limited i...
Journal: CoRR
Volume: abs/1604.04026
Pages: -
Published: 2016